Recognition of Protein-coding Genes Based on Z-curve Algorithms

Authors

  • Feng-Biao Guo
  • Yan Lin
  • Ling-Ling Chen

Abstract

Recognition of protein-coding genes, a classical problem in bioinformatics, is an essential step in annotating newly sequenced genomes. The Z-curve algorithm, one of the most effective methods for this problem, has been successfully applied to annotating or re-annotating many genomes, including those of bacteria, archaea and viruses. Two Z-curve-based ab initio gene-finding programs have been developed: ZCURVE (for bacteria and archaea) and ZCURVE_V (for viruses and phages). ZCURVE_C (for 57 bacteria) and Zfisher (for any bacterium) are web servers for re-annotation of bacterial and archaeal genomes. These four tools can be used for genome annotation or re-annotation, either independently or in combination with other gene-finding programs. In addition to recognizing protein-coding genes and exons, Z-curve algorithms are also effective in recognizing promoters and translation start sites. Here, we summarize the applications of Z-curve algorithms in gene finding and genome annotation.
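The abstract does not spell out the transform itself, so the following is only a minimal sketch of the standard Z-curve definition, not code taken from ZCURVE or any of the other tools named above. Each base moves a point one step along three axes: purine versus pyrimidine, amino versus keto, and weak versus strong hydrogen bonding. The function names `z_curve` and `phase_features` are hypothetical, and the nine codon-position components computed by `phase_features` are shown as an illustrative assumption about the kind of feature a Z-curve gene finder could use, not as the actual feature set of the published programs.

```python
# Per-base steps along the three Z-curve axes:
#   x: purine (A, G) vs pyrimidine (C, T)
#   y: amino (A, C) vs keto (G, T)
#   z: weak (A, T) vs strong (G, C) hydrogen bonding
STEPS = {
    "A": (1, 1, 1),
    "C": (-1, 1, -1),
    "G": (1, -1, -1),
    "T": (-1, -1, 1),
}


def z_curve(seq):
    """Return the cumulative (x, y, z) Z-curve points for a DNA sequence."""
    points = []
    x = y = z = 0
    for base in seq.upper():
        dx, dy, dz = STEPS[base]  # raises KeyError on non-ACGT symbols
        x, y, z = x + dx, y + dy, z + dz
        points.append((x, y, z))
    return points


def phase_features(orf):
    """Return nine Z-curve components, three per codon position.

    Illustrative assumption: base frequencies are taken separately at
    codon positions 1, 2 and 3 of a non-empty ORF.
    """
    features = []
    for phase in range(3):
        bases = orf.upper()[phase::3]  # every third base from this position
        n = len(bases)
        freq = {b: bases.count(b) / n for b in "ACGT"}
        features += [
            (freq["A"] + freq["G"]) - (freq["C"] + freq["T"]),
            (freq["A"] + freq["C"]) - (freq["G"] + freq["T"]),
            (freq["A"] + freq["T"]) - (freq["C"] + freq["G"]),
        ]
    return features


if __name__ == "__main__":
    print(z_curve("ATGC"))
    print(phase_features("ATGATG"))
```

In a sketch like this, vectors from `phase_features` for known coding and non-coding ORFs could be passed to any standard classifier; the choice of classifier is left open here because the abstract does not specify one.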

Similar resources

Comparison of various algorithms for recognizing short coding sequences of human genes

MOTIVATION: Since the early 1980s, there has been great progress in the development of computational gene-finding algorithms. Some problems, however, remain unsolved. Recognizing short genes in prokaryotes and short exons in eukaryotes is one such problem. The paper is devoted to assessing various algorithms, including those currently available and...

Estimating the location of protein-coding regions in numerical DNA sequences using a variable-length window based on the 3-D Z-curve

In recent years, estimation of protein-coding regions in numerical deoxyribonucleic acid (DNA) sequences using signal processing tools has been a challenging issue in bioinformatics, owing to their 3-base periodicity. Several digital signal processing (DSP) tools have been applied to this identification task, concentrating on assigning numerical values to the symbolic DNA sequence, then app...

Graphical Abstracts

Genomic nucleotide composition features revealed by the Z-curve method. Upper panel, the 3-D Z-curve for human chromosome 6. Lower panel, protein-coding and non-coding ORFs are located in distinct regions in a high dimensional space spanned by Z-curve parameters for the R. solanacearum genome. Shown are the first 2 major components by PCA. Current Genomics, 2014, Vol. 15, No. 2 95 Recognition o...

A Novel Fast Algorithm for Exon Prediction in Eukaryotic Genes using Linear Predictive Coding Model and Goertzel Algorithm based on the Z-Curve

Accurate identification of protein-coding regions in deoxyribonucleic acid (DNA) sequences on the basis of their 3-base periodicity has been a challenging issue in bioinformatics. Many digital signal processing (DSP) techniques have been applied to this identification task, concentrating on assigning numerical values to the symbolic DNA sequence and then applying spectral analysis tools such as the sh...

Long non-coding RNAs and their significance in human diseases

Protein-coding genes account for only a small fraction of the human genome, and most genomic sequences are transcriptionally silent, but recent observations indicate significant functional elements, including non-protein-coding transcripts, in the human genome. Long non-coding RNAs (lncRNAs) have been defined as transcripts of >200 nucleotides without protein-coding capacity that perform t...

Journal title:

Volume 15, Issue

Pages -

Publication date: 2014